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A common belief is that evolution generally proceeds 
towards greater complexity at both the organismal and 
the genomic level, numerous examples of reductive 
evolution of parasites and symbionts notwithstanding. 
However, recent evolutionary reconstructions challenge 
this notion. Two notable examples are the reconstruction 
of the complex archaeal ancestor and the intron-rich 
ancestor of eukaryotes. In both cases, evolution in most 
of the lineages was apparently dominated by extensive 
loss of genes and introns, respectively. These and many 
other cases of reductive evolution are consistent with a 
general model composed of two distinct evolutionary 
phases: the short, explosive, innovation phase that leads 
to an abrupt increase in genome complexity, followed by 
a much longer reductive phase, which encompasses 
either a neutral ratchet of genetic material loss or adaptive 
genome streamlining. Quantitatively, the evolution of 
genomes appears to be dominated by reduction and 
simplification, punctuated by episodes of 
complexification. 
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Introduction: Complexity can either 
increase or decrease during the 
evolution of various life forms 

The textbook depiction of the evolution of life on earth is 
that of an ascent toward a steadily increasing organismal 
complexity: from primitive protocells to prokaryotic cells 
to the eukaryotic cell to multicellular organisms to animals 
to humans, the crowning achievement of the entire history 
of life. On general grounds, this "progressivist" view of 
evolution has been repeatedly challenged, in particular in 
the eloquent writings of Gould [1]. Gould argued that the 
average complexity of life forms has barely increased over 
the course of the history of life, even as the upper bound 
of complexity was being pushed upwards, perhaps for 
purely stochastic reasons, under a "drunkard's walk" model 
of evolution. 

It has been well known for decades that the evolution of 
numerous parasitic and symbiotic organisms entails simplifi- 
cation rather than complexification. In particular, bacteria 
that evolve from free-living forms to obligate intracellular 
parasites can lose up to 95% of their gene repertoires without 
compromising the ancestral set of highly conserved genes 
involved in core cellular functions [2, 3]. The mitochondria, 
the ubiquitous energy-transforming organelles of eukaryotes, 
and the chloroplasts, the organelles responsible for the 
eukaryotic photosynthesis, are the ultimate realizations of 
bacterial reductive evolution [4, 5]. However, such reductive 
evolution, its paramount importance for eukaryotes notwith- 
standing, was considered to represent a highly specialized 
trend in the history of life. 

From a more general standpoint, there are effectively 
irrefutable arguments for a genuine increase in complexity 
during evolution. Indeed, the successive emergence of higher 
grades of complexity throughout the history of life is 
impossible to ignore. Thus, unicellular eukaryotes that, 
regardless of the exact dating, evolved more than a billion 
years after the prokaryotes, obviously attained a new level of 
complexity, and multicellular eukaryotic forms, appearing 
even later, by far exceeded the complexity of the unicellular 
ones [5-8]. Arguably, the most compelling is the argument 
from the origin of cellular life itself: before the first cells 
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emerged, there must have been some much simpler (pre) 
biological replicating entities. 

Organismal complexity is hard to define 
but genomic complexity is much more 
tractable 

Complexity is one of those all-important characteristics of any 
system that seems to be easily grasped intuitively ("we know it 
when we see it") but is notoriously difficult to capture in a 
single, quantitative and constructive definition [9, 10]. The 
approach that comes the closest to meeting these criteria 
might involve the quantity known as Kolmogorov complexity 
(also known as algorithmic entropy), which is defined as the 
length of the shortest possible description of a system (often 
represented as a string of symbols) [11]. However, Kolmogorov 
complexity is generally incomputable, and the concept is 
particularly difficult to apply to biological systems because 
of the non-trivial connection between the "description" (the 
genome) and the system itself (organismal phenotype). A 
useful practical approach to quantify the complexity of a 
system is to count the number of distinct parts of which 
it consists, and this is how organismal complexity is usually 
addressed by those that attempt to analyze it in a (semi) 
quantitative manner [12, 13]. Recently, McShea and Brandon 
[13] formulated the "First Law of Biology", or the "Zero Force 
Law of Evolution" according to which unconstrained evolu- 
tion leads to a monotonic increase in the average organismal 
complexity, due to purely the increase of entropy with time 
that is mandated by the second law of thermodynamics for 
any closed system. 

However, the utility of equating complexity with entropy 
is dubious at best as becomes particularly clear when one 
attempts to define genomic complexity. Indeed, using 
sequence entropy (Shannon information) as a measure of 
genomic complexity is obviously disingenuous given that 
under this approach the most complex sequence is a truly 
random one that, almost by definition is devoid of any 
biological information. Hence, attempts have been made to 
derive a measure of biological complexity of a genome by 
equating it with the number of sites that are subject to 
evolutionary constraints, i.e. evolve under purifying selec- 
tion [8, 14, 15]. Although this definition of genomic complexity 
certainly is over-simplified, it shows intuitively reasonable 
trends, i.e. a general tendency to increase with organismal 
complexity [8]. Moreover, introducing the additional defini- 
tion of biological information density, that is per-site 
complexity, one can, at least in principle, describe distinct 
trends in genome evolution such as a trend toward high 
information density that is common in prokaryotes and the 
contrasting trend toward high complexity at low density 
that is typical in multicellular organisms [8]. At a coarse-grain 
level, biological complexity of a genome can be redefined 
as the number of genes that are conserved at a defined 
evolutionary distance. Unlike the number of sites that are 
subject to selection, the conserved genes are rather easy to 
count, so this quantity became the basis for many recon- 
structions of genome evolution [16, 17]. 



The relationship between genomic complexity and the 
complexity at various levels of the phenotype, from molecular 
to organismal, is far from being straightforward as it has 
become clear already in the pre-genomic era [18]. Comparative 
genomics reinforced the complex relationships between the 
different levels of complexity in the most convincing manner 
by demonstrating the lack of a simple link between genomic 
and organismal complexities [19]. Suffice it to note that the 
largest bacterial genomes encompass almost as many genes as 
some "obviously" complex animals, such as for example flies, 
and more than many fungi. One of the implications of these 
comparisons is that there could be other measures of genomic 
complexity that might complement the number of conserved 
genes and perhaps provide a better proxy for organismal 
complexity. For example, in eukaryotes, a candidate for such a 
quantity could be the intron density that reflects the potential 
for alternative splicing [20]. 

Genomic complexity is far easier to quantify than 
phenotypic complexity (even if the latter is easier to recognize 
intuitively). Indeed, the remarkable progress of genome 
sequencing, combined with the development of computation- 
al methods for advanced comparative genomics, provides for 
increasingly reliable reconstruction of ancestral genomes 
which transforms the study of the evolution of complexity 
from being a speculative exercise to becoming an evidence- 
based research direction. Here, we examine the results of 
such reconstructions and make an argument that reductive 
evolution resulting in genome simplification is the quantita- 
tively dominant mode of evolution. 

Genome reduction pervades evolution 

A reconstruction of genome evolution requires that the genes 
from the analyzed set of genomes are clustered into 
orthologous sets that are then used to extract patterns of 
gene presence-absence in the analyzed species. The patterns 
are superimposed on the evolutionary tree of these species 
and the gene compositions of the ancestral forms as well 
as the gene losses and gain along the tree branches are 
reconstructed using either maximum parsimony (MP) or 
maximum likelihood (ML) methods (see Box 1) [21-24]. The ML 
methods yield much more robust reconstructions than the 
MP methods but also require more data. Similar methods 
can be applied to reconstruct evolution of other features 
for which orthologous relationships can be established, e.g. 
intron positions in eukaryotic genes. 

Certainly, we are far from being able to obtain compre- 
hensive evolutionary reconstructions for all or even most life 
forms. Nevertheless, reconstructed evolutionary scenarios are 
accumulating, some of them covering wide phylogenetic 
spans, and many of these reconstructions point to genome 
reduction as a major evolutionary trend (Table 1). The most 
dramatic but also the most obvious are the evolutionary 
scenarios for intracellular parasitic and symbiotic bacteria 
that have evolved from numerous groups of free-living 
ancestors. A typical example is the reductive evolution of 
the species of the intracellular parasites Rickettsia from the 
ancestral "Mother of Rickettsia" [25, 26]. Reductive evolution 
of endosymbionts can yield bacteria with tiny genomes 
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Reconstruction of ancestral genomes: 
Maximum parsimony and maximum 
likelihood approaches 



Dollo Parsimony. Only one gain per character is 
allowed; the pattern of losses, sufficient to produce 
the observed presence-absence pattern, with the 
minimum number of losses, is selected [86, 87]. 
Weighted Parsimony. The relative gain-to-loss weight 
is set prior to reconstruction; the pattern of losses and 
gains with the minimum weighted score, sufficient to 
produce the observed presence-absence pattern, is 
selected [88-90]. 

Maximum Likelihood. Gain and loss probabilities 
per unit of time (possibly different for different tree 
branches) are the parameters; the presence-absence 
pattern and tree branch lengths are observed; the set of 
parameters and the gain-loss pattern, maximizing the 
likelihood of the observed presence-absence pattern, is 
selected [21-24]. 



Dollo Weighted Maximum 

Parsimony Parsimony Likelihood 




The figure illustrates the analysis of a pattern for a single 
character. Red lines indicate branches with losses; 
green lines indicate branches with gains. 



consisting of 150-200 genes and lacking some essential genes 
such as those encoding several aminoacyl-tRNA synthetases, 
which is suggestive of an ongoing transition to an organelle 
state [3]. Indeed, the ultimate cases of reductive evolution 
involve the mitochondria and chloroplasts that have lost 
nearly all ancestral genes (e.g. 13 out of the several thousand 
genes in the ancestral alpha-proteobacterial genome are 
retained in animal mitochondria) or literally all genes in the 
case of hydrogenosomes and mitosomes [27]. Certainly, in this 
case, the evolutionary scenario appears as ultimate reduction 
"from the point of view" of the symbiont; the complexity of the 
emerging chimeric organism drastically increases, both at the 
genomic and at the phenotypic level, and it has been argued 
that such complexification would not have been attainable if 
not for the endosymbiosis [5, 28]. Furthermore, hundreds of 
genes, in the case of the mitochondrion, and even thousands 
in the case of the chloroplast, were not lost but rather 
transferred from the endosymbiont genome to the nuclear 
genome of the host [29-31]. 



Deep genome reduction, with the smallest sequenced 
genome of only 2.9 Mb, is also observed in Microsporidia, 
the eukaryotic intracellular parasites that appear to be highly 
derived fungi [32]. The most extreme genome reduction 
among eukaryotes is observed in nucleomorphs which are 
remnants of algal endosymbionts present in cryptophytes 
and chlorarachniophytes and retain only a few hundred 
genes [33]. 

Beyond parasites and symbionts, reductive evolution was 
observed in several groups of organisms that evolved a 
commensal life style. One of the best-characterized cases 
involves the Lactobacillales, a group of Gram-positive bacteria 
that is extremely common in a variety of animal- and plant- 
associated habitats. A maximum parsimony reconstruction 
revealed substantial gene loss, from ~3,000 genes in the 
common ancestor of Bacilli to ~1, 300-1, 800 genes in various 
Lactobacilli species [34, 35]. The genes apparently have been 
lost in a stepwise manner, with substantial loss associated 
with each internal branch of the tree and most but not all of 
the individual species. These losses were only to a small extent 
offset by inferred gain of new genes. 

Certainly, the evolution of the genomes of parasites, 
symbionts and commensals is not a one-way path of 
reduction. On the contrary, the reduction ratchet is con- 
strained by the advantages of retaining certain metabolic 
pathways that complement the host metabolism [36, 37]. 
Notably, mathematical modeling of the evolution of the insect 
endosymbiont Buchnera aphidicola showed that metabolic 
requirements could determine not only the end point of 
genomic reduction but to some extent also the order of the 
gene deletion [38]. Moreover, the reductive trend is countered 
by proliferation of genes involved in parasite-host interaction 
such as, for example, ankyrin repeat proteins that act as 
secreted virulence factors [39, 40]. Quantitatively, however, 
in most parasites and symbionts, these processes make 
a relatively minor contribution compared to the massive 
genome reduction. 

An evolutionary reconstruction for Cyanobacteria, an 
expansive bacterial phylum that consists mostly of free- 
living forms and includes some of the most complex 
prokaryotes, produced mixed results, with several lineages 
characterized by genome expansion [41]. Nevertheless, even 
in these organisms, evolution of one of the two major 
branches was dominated by extensive genes loss, and 
several lineages were mostly losing genes in the other major 
branch. 

Conceivably, the most compelling evidence of the 
dominance of genome reduction and simplification was 
obtained through the reconstruction of the genomic evolution 
of archaea that almost exclusively are free-living organ- 
isms [17, 42]. The latest ML reconstruction based on a 
comparative analysis of 120 archaeal genomes traced between 
1,400 and 1,800 gene families to the last common ancestor of 
the extant archaea [42]. Given the fractions of conserved and 
lineage-specific genes in modern archaeal genomes, this 
translates into approximately 2,500 genes in the ancestral 
genome, which is a larger genome than most of the extant 
archaea possess (Fig. 1). The reconstructed pattern of gene 
loss and gain in archaea is non-trivial: there seems to have 
been some net gene gain at the base of each of the major 
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Table 1. Reconstructions of the genome evolution for major groups of prokaryotes and eukaryotes 



Taxa 


Depth of evolutionary 
reconstruction 


Subject of 

evolutionary 

reconstruction 


Outcome 


Reference 


Mitochondria 


Proto-mitochondrial 
(alpha-proteobacterial) 
endosymbiosis, presumably, 
last common ancestor of 
eukaryotes 


Genes 


Deep reduction, to the point 
of genome elimination in 
anaerobic protists containing 
hydrogenosomes or mitosomes. 


[4, 5] 


Lactobacillales 


Last common ancestor of 
bacilli 


Gene Families 


Complex ancestor; dominance of 
the reduction mode in all lineages 


[34, 35] 


Anoxybacillus flavithermus 


Last common ancestor of 
Firmicutes 


Gene families 


Ancestral complexification, then 
reduction 


[85] 


Rickettsia 


Last common ancestor 
("mother") of rickettsia 


Genes 


Complex ancestor, dominance of 
the reduction mode in all lineages 


[25, 26] 


Cyanobacteria including 


Last common ancestor of 


Genes 


Complex ancestor, complexification 


[41] 


chloroplasts 


cyanobacteria 




in some lineages, reduction in other 
lineages, ultimate reduction in 
chloroplasts 




Archaea 


Last archaeal common 
ancestor 


Gene families 


Moderately complex ancestor, 
ancestral complexification in some 
lineages, more recent dominance of 
genome reduction in all lineages 


[42] 


Eukaryotes 


Last eukaryotic common 


Protein domain 


Complex ancestor, reduction of the 






ancestor 


families 


domain repertoire in most lineages, 
expansion only in multicellular 
organisms 




Eukaryotes 


Last common ancestor of 
eukaryotes 


Introns 


Complex early ancestors, mostly 
reductive evolution, complexification 
in some, primarily multicellular 
lineages 


[20, 53] 


Microsporidia 


Last common ancestor of 
microsporidia 


Genes 


Complex ancestor, deep reduction 


[32] 



archaeal branches that was almost invariably followed by 
substantial gene loss; as discussed below, this could be a 
general pattern of genome evolution. The notable exceptions 
are Halobacteria and Methanosarcinales, the two archaeal 
lineages in which evolution was strongly impacted by 
horizontal gene transfer from bacteria [43, 44] that offset 
the gene loss and led to genome expansion (Fig. 1). Although 
less reliable than the genome-wide ML reconstructions, 
attempts on the reconstruction of the ancestral state of 
specific functional systems seem to imply even more striking 
complexity of archaeal ancestors. For example, comparative 
analysis of the cell divisions machineries indicates that the 
common ancestor of the extant archaea might have possessed 
all three varieties of the division systems found in modern 
forms [45]. 

Reconstructions of the evolution of eukaryotic genomes 
yielded expanding ancestors as the number of diverse 
genomes available for comparative analysis grew. At least 
until recently, the available collection of eukaryotic genomes 
remained insufficient for reliable ML reconstruction. However, 
maximum parsimony reconstruction traced between 4,000 
and 5,000 to the last eukaryotic common ancestor (LECA) 
[46, 47]. An even simpler analysis identified over 4,000 genes 
that are shared between Naegleria gruberi, the first free-living 
excavate (one of the supergroups of unicellular eukaryotes 
that also includes parasitic forms such as trichomonas and 



giardia) for which the genome was sequenced and at least one 
other supergroup of eukaryotes, suggesting that these genes 
were inherited from the LECA [48, 49]. Such estimates are 
highly conservative as they disregard parallel gene loss in 
different major lineages, an important phenomenon in the 
evolution of eukaryotes. Indeed, even animals and plants, 
the eukaryotic kingdoms that seem to be the least prone to 
gene loss, have lost about 20% of the putative ancestral genes 
identified in the unicellular Naegleria. Collectively, these 
findings imply that the genome of the LECA was at least 
as complex as the genomes of typical extant free-living 
unicellular eukaryotes [50]. Even more striking conclusions 
were reached by the reconstruction of the evolution of the 
eukaryotic protein domain repertoire that involved compari- 
son of 114 genomes [51]. The results of this reconstruction 
indicate that most of the major eukaryotic lineages have 
experienced a net loss of domains that have been traced to 
the LECA. Substantial increase in protein complexity appears 
to be associated only with the onset of the evolution of the 
two kingdoms of multicellular eukaryotic organisms, plants, 
and animals. 

Remarkably congruent results have been obtained in 
reconstructions of the gain and loss of introns in eukaryotic 
genes. In this case, the availability of thousands intron 
positions provide for the use of powerful ML methods. 
The reconstructions consistently indicate that ancestral 
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Figure 1 . Reconstruction of the evolution of the archaea.The color 
code indicates the number of genes that belongs to clusters of 
archaeal orthologous genes (arCOGs) in the extant genomes and 
the reconstructed number of arCOGs in the ancestral forms [42]. 
The figure is modified from [42], 



eukaryotes including the LECA and the founders of each 
supergroup were intron-rich forms, with intron densities 
higher than those in the genes of most extant eukaryotes and 
probably only slightly lower than those in the modern 
organisms with the most complex gene structures, such as 
mammals [20, 52, 53] . Remarkably, intron-rich ancestors were 
reconstructed even for those major groups of eukaryotes that 
currently consist entirely of intron-poor forms such as the 
alveolates that apparently evolved via differential, lineage- 
specific, extensive intron loss [54]. All in all, intron loss 
clearly dominated the evolution of eukaryotic genes, with 
episodes of substantial gain linked only with the emergence 
of some major groups, especially animals [20, 53], in full 
agreement with the results of the evolutionary reconstruction 
for the eukaryotic domain repertoire [51]. As previously 
pointed out by Brinkmann and Philippe [55], simplification 
could be an "an equal partner to complexification" in the 
evolution of eukaryotes. The latest reconstruction results 
suggest that simplification could be even "more equal" than 
complexification. 



Both neutral and adaptive routes lead to 
genome reduction 

Genome reduction in different life forms seems to have 
occurred via two distinct routes: (i) the neutral gene loss 
ratchet and (ii) adaptive genome streamlining [8, 56]. 
Typically, the reductive evolution of intracellular pathogens 
does not seem to be adaptive inasmuch as the gene loss does 
not appear to occur in parallel with other trends suggestive of 
streamlining such as shrinldng of intergenic regions or intense 
selection on protein-coding sequences manifest in a low Ka/Ks 
ratio. On the contrary, the intracellular bacteria appear to 
rapidly evolve under weak selection [3, 57]. The lack of 
correlation between different genomic features that are 
generally viewed as hallmarks of adaptive genome stream- 
lining (i.e. selection for rapid replication), along with the 
presence of numerous pseudogenes that seem to persist for 
relatively long time spans and similarly persistent mobile 
elements [3, 58-60], implies that in these organisms genomic 
reduction stems from neutral ratchet-like loss of genes that 
are non-essential for intracellular bacteria. This route of 
evolution conceivably was enabled by the virtual sequestra- 
tion of intracellular parasites and symbionts from HGT and by 
the ensuing reduction of the effective population size [61-63]. 
This apparent non-selective mode of gene loss is compatible 
with the small effective population size of parasites and 
symbionts, which results in an increased evolutionary role of 
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genetic drift and infeasibility of strong selection [64, 65]. On a 
long-term evolutionary scale, these organisms are likely to be 
headed for extinction due to the diminished evolutionary 
flexibility that reduces their chance of survival in case of 
environment change [66]. Coming back to the definitions 
introduced above, in the evolution of parasites and symbionts, 
the decrease in the biological complexity of genomes occurs in 
parallel with the decrease in information density. 

However, bona fide adaptive genome streamlining 
appears to be a reality of evolution as well. Features of such 
streamlining are detectable in the genomes of the highly 
successful free-living organisms such as the cyanobacterium 
Prochlorococcus sp. [67, 68] and the alpha-proteobacterium- 
Candidatus Pelagibacter ubique, apparently the most 
abundant cellular life forms on earth [56, 69, 70]. These 
bacteria possess highly compact genomes and evolve under 
strong purifying selection suggesting that in these cases, the 
loss of non-essential genes, mobile elements and intergenic 
regions is indeed driven by powerful selection for rapid 
genome replication and minimization of the resources 
required for growth. Genome evolution of these highly 
successful life forms involves a drop in the overall complexity 
but an increase in information density. Of course, all the 
pressure of genome streamlining notwithstanding, the 
lifestyle of these free-living, autotrophic organisms imposes 
non-negotiable constraints on the extent of gene loss in these 
organisms because they have to maintain complete, even if 
minimally diversified metabolic networks. Additionally, an 
important factor in the evolution of these organisms that 
dwell in microbial communities could the "Black Queen 
effect" whereby selection operates at the community level 
so that otherwise essential genes can be lost as long as the 
respective metabolites or other commodities are provided to 
some community members [56, 71]. 

Reconstructions of genome evolution in both prokaryotes 
and eukaryotes indicate that the loss of genes and introns 



typically occurs roughly proportionally to time, thus con- 
forming with a form of genomic molecular clock [53, 72-75]. In 
contrast, the gain of genes and introns appears to be sporadic 
and mostly associated with major evolutionary innovations, 
such as in particular the origin of animals and plants. Thus, it 
has been concluded that gene loss is mostly neutral, within 
the constraints imposed by gene-specific purifying selection, 
whereas gene gain is controlled by positive selection [75]. The 
former conclusion seems to be robust whereas the latter is 
dubious as gene gain in transitional epochs could be more 
plausibly attributed to genetic drift enabled by the population 
bottlenecks that are characteristic of these turbulent periods 
of evolution [8, 65, 76]. 

In cases of both neutral and adaptive genome reduction, 
this process appears to involve specialization contingent on 
environmental predictability whereas the bursts of innovation 
considerably opens up multiple new niches for exploration by 
evolving organisms. 

A biphasic model of evolution 

The findings that in many if not most lineages evolution is 
dominated by gene (and more generally, DNA) loss that occurs 
in a roughly clock-like manner whereas gene gain occurs in 
bursts associated with the emergence of major new groups 
of organisms imply a biphasic model of evolution (Fig. 2). 
Under this model, the evolutionary process in general can be 
partitioned into two phases of unequal duration: (i) genomic 
complexification at faster than exponential rate that is 
associated with stages of major innovation and involves 
extensive gene duplication, gene gain from various sources, in 
particular horizontal gene transfer including that from 
endosymbionts, and other genomic embellishments such as 
eukaryotic introns, and (ii) genomic simplification associated 
with the gradual loss of genes and genetic material in general, 




Figure 2. The biphasic model of punctuated evolu- 
tion of genomes. Top: Periods of compressed 
cladogenesis punctuating long phases of quasi- 
stasis in the history of a particular lineage. Bottom: 
Complexity profile. The vertical axis implies the 
biological complexity of genomes that can be 
expressed as the number of sites or genes that are 
subject to selection. The green background indi- 
cates the complexification phase and the red 
background indicates the reduction phase.The 
dashed lines indicate the super-exponential growth 
rate in the complexification phase. 
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typically at the rate of exponential decay. The succession of 
the two phases appears to be a recurrent pattern that defines 
the entire course of the evolution of life. The first, innovative 
phase of evolution is temporally brief, engenders dramatic 
genomic and phenotypic perturbations, and is linked to 
population bottlenecks. The second, reductive phase that 
represents "evolution as usual" is protracted in time, is 
facilitated by the deletion bias that seems to be a general 
feature of genome evolution [77-79], and is associated either 
with a continuously small effective population size, as in 
parasites and symbionts with decaying genomes, or with 
evolutionary success and increasing effective population size 
as in free-living organisms undergoing genome streamlin- 
ing [56, 57, 64]. Clearly, the reductive phase of evolution is not 
limited to the loss of genes that were acquired in a preceding 
burst of innovation. An excellent case in point is the evolution 
of eukaryotes, where the explosive phase of eukaryogenesis 
yielded duplications of a substantial number of genes. Many 
of these gene duplicates diversified and persisted throughout 
the course of eukaryote evolution whereas numerous other 
genes were lost in multiple lineages [46, 47, 51]. 

Interestingly, detailed reconstruction of the independent 
processes of reductive evolution in several parasitic bacteria 
appears to reveal a "domino effect" that, on a much smaller 
evolutionary scale, causes punctuation in reductive evolution 
itself [80]. It appears that the gradual, stochastic course of 
gene death is punctuated by occasional bursts when a gene 
belonging to a functional module or pathway is eliminated, 
rendering useless the remaining genes in the same module or 
pathway. 

Certainly, the biphasic model of evolution depicted in 
Fig. 2 is not all-encompassing as continuous, long-term 
increase in genome complexity (but not necessarily biological 
information density) is observed in various lineages, our own 
history (that is, evolution of vertebrates) being an excellent 
case in point. Nevertheless, to the best of our present 
understanding informed by the reconstructions of genome 
evolution, extensive loss of genetic material punctuated by 
bursts of gain is the prevailing mode of evolution. 

The biphasic model of evolution presented here expands 
on the previously developed scenario of compressed clado- 
genesis [81-83]. It also conceptually reverberates with Gould's 
and Eldredge's punctuated equilibrium model [84], where 
the periods of "stasis" actually represent relatively slow 
genome dynamics that in many if not most lines of descent is 
dominated by the loss of genetic material. 

Conclusions and outlook 

The results of evolutionary reconstructions for highly diverse 
organisms and through a wide range of phylogenetic depths 
indicate that contrary to widespread and perhaps intuitively 
plausible opinion, genome reduction is a dominant mode 
of evolution that is more common than genome complex- 
ification, at least with respect to the time allotted to these 
two evolutionary regimes. In other words, many if not most 
major evolving lineages appear to spend much more time 
in the reductive mode than in the complexification mode. 
The two regimes seem to differ also qualitatively in that 



genome reduction seems to occur more or less gradually, in a 
roughly clock-like manner, whereas genome complexification 
appears to occur in bursts accompanying evolutionary 
transitions. Genome reduction apparently occurs in two 
distinct and distinguishable manners, i.e. either via a neutral 
ratchet of genetic material loss or by adaptive genome 
streamlining. 

Despite the diversity of the available case stories of 
reductive evolution, the current material is obviously 
insufficient for an accurate estimation of the relative 
contributions of genome reduction and complexification to 
the evolution of different groups of organisms. To derive such 
estimates, evolutionary reconstructions on dense collections 
of genomes from numerous taxa are required. Even more 
detailed analysis including careful mapping of loss and gain 
of genetic material to specific stages of evolution is necessary 
to refute or validate the model of punctuated genome 
evolution outlined here. In a more abstract plane, a major 
goal for future work is the development of a rigorous theory to 
explain biphasic evolution with the populations' dynamic 
framework. 
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